Automatic Voice Disorder Detection Using Self-Supervised Representations

Authors

Abstract

Many speech features and models, including Deep Neural Networks (DNN), are used for classification tasks between healthy and pathological speech with the Saarbruecken Voice Database (SVD). However, accuracy values of 80.71% for phrases or 82.8% for vowels /aiu/ are the highest reported for audio samples in SVD when the evaluation includes the wide amount of pathologies in the database, instead of a selection of some pathologies. This paper targets this top performance in state-of-the-art Automatic Voice Disorder Detection (AVDD) systems. In the framework of a DNN-based AVDD system, we study the capability of Self-Supervised (SS) representation learning for describing discriminative cues between healthy and pathological speech. The system processes the SS temporal sequence with a single feed-forward layer and a Class-Token (CT) Transformer for obtaining the classification. Furthermore, a suitable data extension of the training set with out-of-domain data is also evaluated to deal with the low availability of data for using DNN-based models in voice pathology detection. Experimental results using audio samples corresponding to phrases in the SVD dataset, including all the pathologies available, show accuracy values up to 93.36%. This means that the proposed AVDD system achieved accuracy improvements of 4.1% without the training data extension, and 15.62% after the training data extension, compared to the baseline system. Beyond the novelty of using SS representations for AVDD, the fact of obtaining accuracies over 90% in these conditions, using the whole set of pathologies in the SVD, is a milestone for voice disorder-related research. The study on the amount of in-domain data in the training set related to system performance also gives guidance for the data preparation stage. Lessons learned in this work suggest guidelines for taking advantage of DNN to boost performance in developing automatic systems for the diagnosis, treatment, and monitoring of voice pathologies.
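The pipeline named in the abstract (a feed-forward layer over the SS frame sequence, followed by class-token pooling and a binary decision) can be illustrated with a minimal sketch. This is not the authors' implementation: all function names, dimensions, and the random weights are hypothetical, and the attention step is reduced to a single head with no learned query/key/value projections, residual connections, or layer normalization, all of which a real Transformer encoder would include.

```python
import math
import random


def linear(x, w, b):
    """Apply y = W x + b to a single feature vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_token_pool(frames, cls_token):
    """Prepend the class token and attend over the sequence with it as the query.

    Simplification (assumption): tokens act as their own keys and values.
    Returns the pooled summary vector and the attention weights.
    """
    seq = [cls_token] + frames
    scale = math.sqrt(len(cls_token))
    scores = [sum(q * k for q, k in zip(cls_token, tok)) / scale for tok in seq]
    weights = softmax(scores)
    dim = len(cls_token)
    summary = [sum(w * tok[d] for w, tok in zip(weights, seq)) for d in range(dim)]
    return summary, weights


def init_params(rng, ss_dim, model_dim):
    """Create small random weights; a trained system would learn these."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "w_in": matrix(model_dim, ss_dim), "b_in": [0.0] * model_dim,
        "cls": [rng.uniform(-0.1, 0.1) for _ in range(model_dim)],
        "w_out": matrix(1, model_dim), "b_out": [0.0],
    }


def pathological_probability(ss_frames, params):
    """Feed-forward projection, class-token pooling, then a sigmoid output."""
    projected = [linear(f, params["w_in"], params["b_in"]) for f in ss_frames]
    summary, _ = class_token_pool(projected, params["cls"])
    logit = linear(summary, params["w_out"], params["b_out"])[0]
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    rng = random.Random(0)
    params = init_params(rng, ss_dim=8, model_dim=4)
    # Stand-in for SS features: 5 frames of 8 random values each.
    frames = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(5)]
    print(pathological_probability(frames, params))
```

The point of the class token is that a variable-length frame sequence is summarized into one fixed-size vector, so the final classifier does not depend on the recording's duration.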

Related resources

Automatic Image Orientation Detection Using the Supervised Self-organizing Map

Automatic detection and correction of image orientation is of great importance in intelligent image processing. In this paper, we present an automatic image orientation detection algorithm based on the supervised self-organizing map (SOM). The SOM is trained by using compact and efficient low-level chrominance (color) features in a supervised manner. Experiments have been conducted on a database...

Using Nonlinear Features for Voice Disorder Detection

In this paper we propose the use of nonlinear speech features to improve voice quality measurement. We have tested two features from Dynamical Systems Theory, namely the Correlation Dimension and the largest Lyapunov Exponent. In particular, we have studied the optimal time window size for this type of analysis in the characterization of voice quality. Two sy...

Voice Disorder Detection Based on Automatic Speaker Identification Techniques

In this paper, we investigate the properties of automatic speaker identification (ASI) to develop a system for voice pathology detection, where the models correspond not to different speakers but to classes of patients who share the same diagnosis. One essential part of this topic is the database (described later); the voice samples (healthy and pathological) are chosen f...

Automatic detection of voice creak

The analysis of large spontaneous speech corpora reveals that creaky mode appears more frequently than expected, especially for young female speakers. Creaky mode usually creates fundamental frequency measurement errors, and creaky voice segments must often be identified manually beforehand to avoid erroneous readings of F0 in large speech databases. Various approaches have been proposed to ident...

Voice disguise and automatic detection

This study focuses on the question of voice disguise and the problem of its detection. Voice disguise is considered a deliberate action by a speaker who wants to falsify or conceal his identity. Many possibilities are available to a speaker to change his voice and to fool a human ear or an automatic system. He could transform his voice by electronic scrambling or more simply by e...

Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3243986